Theano tensor 模块:conv 子模块

convtensor 中处理卷积神经网络的子模块。

卷积

这里只介绍二维卷积:

T.nnet.conv2d(input, filters, input_shape=None, filter_shape=None, border_mode='valid', subsample=(1, 1), filter_flip=True, image_shape=None, **kwargs)

conv2d 函数接受两个输入:

  • 4D 张量 input,其形状如下:

    [b, ic, i0, i1]

  • 4D 张量 filter ,其形状如下:

    [oc, ic, f0, f1]

border_mode 控制输出大小:

  • 'valid':输出形状:

    [b, oc, i0 - f0 + 1, i1 - f1 + 1]

  • 'full':输出形状:

    [b, oc, i0 + f0 - 1, i1 + f1 - 1]

池化

池化操作:

T.signal.downsample.max_pool_2d(input, ds, ignore_border=None, st=None, padding=(0, 0), mode='max')

input 池化操作在其最后两维进行。

ds 是池化区域的大小,用长度为 2 的元组表示。

ignore_border 设为 Ture 时,(5, 5)(2, 2) 的池化下会变成 (2, 2)(5 % 2 == 1,多余的 1 个被舍去了),否则是 (3, 3)

MNIST 卷积神经网络形状详解

def model(X, w, w2, w3, w4, p_drop_conv, p_drop_hidden):

    # X:  128 * 1 * 28 * 28
    # w:  32 * 1 * 3 * 3
    # full mode
    # l1a: 128 * 32 * (28 + 3 - 1) * (28 + 3 - 1)
    l1a = rectify(conv2d(X, w, border_mode='full'))
    # l1a: 128 * 32 * 30 * 30
    # ignore_border False
    # l1:  128 * 32 * (30 / 2) * (30 / 2)
    l1 = max_pool_2d(l1a, (2, 2), ignore_border=False)
    l1 = dropout(l1, p_drop_conv)

    # l1:  128 * 32 * 15 * 15
    # w2:  64 * 32 * 3 * 3
    # valid mode
    # l2a: 128 * 64 * (15 - 3 + 1) * (15 - 3 + 1)
    l2a = rectify(conv2d(l1, w2))    
    # l2a: 128 * 64 * 13 * 13
    # l2:  128 * 64 * (13 / 2 + 1) * (13 / 2 + 1)
    l2 = max_pool_2d(l2a, (2, 2), ignore_border=False)
    l2 = dropout(l2, p_drop_conv)

    # l2:  128 * 64 * 7 * 7
    # w3:  128 * 64 * 3 * 3
    # l3a: 128 * 128 * (7 - 3 + 1) * (7 - 3 + 1)
    l3a = rectify(conv2d(l2, w3))
    # l3a: 128 * 128 * 5 * 5
    # l3b: 128 * 128 * (5 / 2 + 1) * (5 / 2 + 1)
    l3b = max_pool_2d(l3a, (2, 2), ignore_border=False)    
    # l3b: 128 * 128 * 3 * 3
    # l3:  128 * (128 * 3 * 3)
    l3 = T.flatten(l3b, outdim=2)
    l3 = dropout(l3, p_drop_conv)

    # l3: 128 * (128 * 3 * 3)
    # w4: (128 * 3 * 3) * 625
    # l4: 128 * 625
    l4 = rectify(T.dot(l3, w4))
    l4 = dropout(l4, p_drop_hidden)

    # l5:  128 * 625
    # w5:  625 * 10
    # pyx: 128 * 10
    pyx = softmax(T.dot(l4, w_o))
    return l1, l2, l3, l4, pyx